AITopics | bandit approach

Collaborating Authors

bandit approach

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Bandit Approach to Sequential Experimental Design with False Discovery Control

Neural Information Processing SystemsNov-20-2025, 22:32:23 GMT

We propose a new adaptive sampling approach to multiple testing which aims to maximize statistical power while ensuring anytime false discovery control. We consider $n$ distributions whose means are partitioned by whether they are below or equal to a baseline (nulls), versus above the baseline (true positives). In addition, each distribution can be sequentially and repeatedly sampled. Using techniques from multi-armed bandits, we provide an algorithm that takes as few samples as possible to exceed a target true positive proportion (i.e.

bandit approach, name change, sequential experimental design, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

A Bandit Approach with Evolutionary Operators for Model Selection

Brégère, Margaux, Keisler, Julie

arXiv.org Artificial IntelligenceFeb-7-2024

This paper formulates model selection as an infinite-armed bandit problem. The models are arms, and picking an arm corresponds to a partial training of the model (resource allocation). The reward is the accuracy of the selected model after its partial training. In this best arm identification problem, regret is the gap between the expected accuracy of the optimal model and that of the model finally chosen. We first consider a straightforward generalization of UCB-E to the stochastic infinite-armed bandit problem and show that, under basic assumptions, the expected regret order is $T^{-\alpha}$ for some $\alpha \in (0,1/5)$ and $T$ the number of resources to allocate. From this vanilla algorithm, we introduce the algorithm Mutant-UCB that incorporates operators from evolutionary algorithms. Tests carried out on three open source image classification data sets attest to the relevance of this novel combining approach, which outperforms the state-of-the-art for a fixed budget.

algorithm, configuration, mutant-ucb, (14 more...)

arXiv.org Artificial Intelligence

2402.05144

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (1.00)

Industry: Education > Educational Setting > Online (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.91)
Information Technology > Data Science > Data Mining > Big Data (0.88)
(3 more...)

Add feedback

A Bandit Approach to Sequential Experimental Design with False Discovery Control

Jamieson, Kevin G., Jain, Lalit

Neural Information Processing SystemsFeb-14-2020, 13:11:44 GMT

artificial intelligence, machine learning, sequential experimental design, (5 more...)

Neural Information Processing Systems

Genre: Research Report (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Online Learning of Robot Soccer Free Kick Plans Using a Bandit Approach

Mendoza, Juan Pablo (Carnegie Mellon University) | Simmons, Reid (Carnegie Mellon University) | Veloso, Manuela (Carnegie Mellon University)

AAAI ConferencesJun-3-2016

This paper presents an online learning approach for teams of autonomous soccer robots to select free kick plans. In robot soccer, free kicks present an opportunity to execute plans with relatively controllable initial conditions. However, the effectiveness of each plan is highly dependent on the adversary, and there are few free kicks during each game, making it necessary to learn online from sparse observations. To achieve learning, we first greatly reduce the planning space by framing the problem as a contextual multi-armed bandit problem, in which the actions are a set of pre-computed plans, and the state is the position of the free kick on the field. During execution, we model the reward function for different free kicks using Gaussian Processes, and perform online learning using the Upper Confidence Bound algorithm. Results from a physics-based simulation reveal that the robots are capable of adapting to various different realistic opponents to maximize their expected reward during free kicks.

bandit approach, online learning, robot soccer free kick plan

AAAI Conferences

Twenty-Sixth International Conference on Automated Planning and Scheduling

Industry: Leisure & Entertainment > Sports > Soccer (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.87)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.80)
Information Technology > Artificial Intelligence > Robots > Soccer Robots (0.53)

Add feedback